Multicycle Broadcast Bypass: Too Readily Overlooked
Abstract
The bypass path, also called the forwarding path, lets a processor broadcast operands from one functional unit to another more quickly than through the register file. In modern superscalar out-of-order CPUs, bypass is part of the execute pipeline stage, allowing dependent instructions to issue on subsequent cycles. In these machines, however, bypass network complexity is becoming a limiting factor in frequency scaling. Traditionally, architects have been unwilling to separate execute and bypass into different stages for fear of huge IPC losses. Through cycle-time calculations and cycle-accurate simulation with Spec2000int and Mediabench, though, we show that multicycle broadcast bypass is a simple and beneficial design choice. By allowing bypassed values multiple cycles to reach their destination, processor frequency can be increased by more than IPC decreases. This solution involves no repeaters, no instruction steering, and no complex control logic. At 90 nm, instruction throughput increases by 9% when the bypass is moved into its own stage on a four-wide machine, and by 16% when two bypass stages are added to an eight-wide machine.
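The trade-off in the abstract rests on throughput being the product of IPC and clock frequency: pipelining the bypass pays off whenever the frequency gain outweighs the IPC loss. The sketch below works through that arithmetic. The function names and the input percentages are illustrative assumptions; the abstract reports only the net throughput gains (9% and 16%), not the underlying frequency and IPC figures.

```python
# Illustrative only: the input percentages below are assumptions,
# not values reported in the paper.

def throughput_change(freq_gain: float, ipc_loss: float) -> float:
    """Relative change in instructions per second.

    Throughput = IPC x frequency, so the relative change is
    (1 + freq_gain) * (1 - ipc_loss) - 1.
    """
    return (1.0 + freq_gain) * (1.0 - ipc_loss) - 1.0


def break_even_freq_gain(ipc_loss: float) -> float:
    """Smallest frequency gain that exactly offsets a given IPC loss."""
    return 1.0 / (1.0 - ipc_loss) - 1.0


if __name__ == "__main__":
    # Hypothetical design point: 20% faster clock, 10% lower IPC.
    print(f"net throughput change: {throughput_change(0.20, 0.10):+.1%}")
    print(f"break-even clock gain: {break_even_freq_gain(0.10):.1%}")
```

Under these assumed numbers, a 20% clock gain with a 10% IPC loss nets roughly an 8% throughput gain, and any clock gain above about 11% would be enough to break even.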
Related resources
Application-Bypass Broadcast in MPICH over GM
Processes of a parallel program can become unsynchronized, or skewed, during the course of running an application. Processes can become skewed as a result of unbalanced or asymmetric code, or through the use of heterogeneous systems, where nodes in the system have different performance characteristics, as well as random, unpredictable effects such as the processes not being started at exactly t...
Multicycle Polling Scheduling Algorithm
The paper deals with the scheduling of periodic information flow in a FieldBus environment. The scheduling problem is defined from an analytical point of view, with a brief survey of the best-known solutions and an illustration of multicycle polling scheduling, which is based on the hypothesis that the production periods of all periodic processes to be scheduled are harmonic. Although in some...
A comparison of laboratory findings in coronary artery bypass surgery with and without cardiopulmonary bypass
Background: The search for a coronary artery bypass surgery technique with fewer complications is ongoing; to this end, many studies have compared patients undergoing CABG with or without cardiopulmonary bypass. This study was carried out to compare laboratory findings after coronary artery bypass in these two groups of patients. Materials and Methods: In a retrospective study, 167 patients ...
Demand-Only Broadcast: Reducing Register File and Bypass Power in Clustered Execution Cores
This paper introduces a technique called Demand-Only Broadcast that reduces the power consumption of the register file and result bypass network in a clustered execution core. With this technique, an instruction's result is broadcast to remote clusters only if it is needed by dependents in those clusters. Demand-Only Broadcast was evaluated using a performance–power simulator of a high-perf...
Publication date: 2004